In IRT, a person’s true score is defined independently of the specific test (Hambleton & Jones)
Imagine two non-parallel math tests (one is easier)
Important in education – we can define math “ability” independently of the specific math test given
\[ P_j(\theta) = \text{Prob}(X_j = 1 \mid \theta)\]
Many IRT models use a logistic model for \(P_j\)
The logistic function maps a real-valued variable onto the unit interval (0, 1)
\[ p = \frac{\exp(x)}{1 + \exp(x)} \]
\[ x = \log \frac{p}{1 - p} \]
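A quick numeric check (plain R, no packages; function names are mine) that these two maps are inverses of each other:

```r
# Logistic: maps any real x onto (0, 1)
logistic <- function(x) exp(x) / (1 + exp(x))

# Logit: maps a probability p in (0, 1) back onto the real line
logit <- function(p) log(p / (1 - p))

logistic(0)         # 0.5
logit(logistic(2))  # recovers 2
```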
\[P_j(\theta) = \frac{\exp(a_j(\theta - b_j))}{ 1 + \exp(a_j(\theta - b_j))} \]
\[ \text{logit}(P_j(\theta)) = a_j(\theta - b_j) \]
\[ P_j(b_j) = \frac{\exp(a_j (0))} {1 + \exp(a_j (0))} = 1/2. \]
Interpretation: \(b_j\) is the value of \(\theta\) at which the probability of endorsing an item = 1/2.
Respondents with ability above the difficulty level of the item have probability > 1/2 of answering the item correctly, and conversely.
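A small sketch of the 2PL IRF (the function `irf_2pl` and the parameter values are mine, for illustration only) showing that the probability is exactly 1/2 when ability equals difficulty:

```r
# 2PL item response function: a = discrimination, b = difficulty
irf_2pl <- function(theta, a, b) {
  z <- a * (theta - b)
  exp(z) / (1 + exp(z))
}

irf_2pl(theta = 1, a = 2, b = 1)  # exactly 0.5: ability equals difficulty
irf_2pl(theta = 2, a = 2, b = 1)  # above 0.5: ability exceeds difficulty
irf_2pl(theta = 0, a = 2, b = 1)  # below 0.5: ability below difficulty
```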
\[ \frac{\partial}{\partial \theta} P_j(\theta) = a_j P_j(\theta) (1 - P_j(\theta)) \]
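This derivative identity can be checked numerically against a finite-difference approximation (the item parameters here are arbitrary):

```r
# 2PL item response function
irf_2pl <- function(theta, a, b) {
  z <- a * (theta - b)
  exp(z) / (1 + exp(z))
}

a <- 1.5; b <- 0; theta <- 0.3
p <- irf_2pl(theta, a, b)
analytic <- a * p * (1 - p)  # a_j * P_j(theta) * (1 - P_j(theta))

# Central finite-difference approximation of the slope at theta
h <- 1e-6
numeric <- (irf_2pl(theta + h, a, b) - irf_2pl(theta - h, a, b)) / (2 * h)

abs(analytic - numeric)  # negligible
```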
In IRT, the concept of (Fisher) information takes a central role
Information is the precision with which we can estimate \(\theta\) given the observed response data
\[I(\theta) = 1 / (SE[\theta])^2\]
Interpretation: minimum zero, big values are good!
In IRT, information takes the central role rather than reliability
The item information function (IIF) is the precision that results when estimating the latent trait using a single item
In practice, we would never use only a single item on a test
But, we can build up the information function of the entire test from that of each individual item
So, we start with the IIF and then use that to get the test information function (TIF)
\[I_j(\theta) = a_j^2 P_j(\theta) (1 - P_j(\theta)) \]
Very similar to the slope of the IRF!
Interpretation: an item is most informative near its own difficulty (\(I_j\) peaks at \(\theta = b_j\), where \(P_j = 1/2\)), and more discriminating items (larger \(a_j\)) provide more information
\[ I(\theta) = \sum_{j = 1}^{J} I_j(\theta) \]
Interpretation: The information provided by a test is just the sum of the information of its individual items!
Note: this requires an assumption called conditional independence, which we will discuss next week
Information is not easy to interpret
Used mainly for comparisons among different tests
To report on a single test, it is more usual to use reliability, but now as a function of \(\theta\)
\[ R(\theta) = \frac{I(\theta)}{1 + I(\theta)} \]
Sometimes it is still desirable to have a single number summary for reliability
For this, we can average the reliability function over the distribution of \(\theta\)
Usually called marginal IRT reliability, but could also be called average reliability
Sort of defeats the purpose of IRT…
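One simple way to form that single-number summary is to average \(R(\theta) = I(\theta)/(1 + I(\theta))\) over a standard normal prior for \(\theta\). This is a sketch of the idea with hypothetical item parameters, not necessarily the exact computation `mirt::marginal_rxx()` performs:

```r
# Item information for the 2PL
iif_2pl <- function(theta, a, b) {
  p <- 1 / (1 + exp(-a * (theta - b)))
  a^2 * p * (1 - p)
}

a <- c(1.5, 2.3); b <- c(-1.0, 0.5)  # hypothetical item parameters
theta <- seq(-6, 6, by = 0.01)

# Test information and the pointwise reliability function
tif <- rowSums(sapply(seq_along(a), function(j) iif_2pl(theta, a[j], b[j])))
rel <- tif / (1 + tif)               # R(theta) = I(theta) / (1 + I(theta))

# Average reliability under a N(0, 1) prior for theta
w <- dnorm(theta)
w <- w / sum(w)
marginal_rel <- sum(rel * w)
```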
load("ECDI_learning.RData")
library(mirt)
# Fit model
fit.2pl <- mirt(ecdi_learning, model = 1, itemtype = "2PL", verbose = FALSE)
# Model params
coef(fit.2pl, IRTpars = TRUE, simplify = TRUE)
# Plots
plot(fit.2pl, type = "trace", facet_items = FALSE)     # IRFs
plot(fit.2pl, type = "infotrace", facet_items = FALSE) # IIFs
plot(fit.2pl, type = "info") # TIF
plot(fit.2pl, type = "rxx") # Reliability
# Marginal reliability
marginal_rxx(fit.2pl)